Failure simulation for a phoneme HMM based keyword spotter
Abstract
A basic problem in keyword spotting is that the keywords themselves cannot be completely different from background speech. False alarms therefore arise from those parts of a keyword that also occur in the background. The paper describes the application of a model trellis that makes it possible to test, in a statistical way, how individual phoneme sequences influence the underlying phoneme HMMs. It is shown that the Viterbi path is strongly affected by these partly fitting phoneme groups. The probability of occurrence of these phoneme sequences is captured by a statistical "speech model" consisting of a Markov graph of order up to 2, so that sequences of 1, 2, or 3 phonemes are considered. By combining the model trellis with the statistical speech model, the probability of false alarms can be calculated in advance, which provides a useful measure of the suitability of the keyword under consideration. When the choice of keywords was optimized using this suitability measure in a practical application (spotting multicom 94.4 data), the false alarm rate was reduced by a factor of 3.5.
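The combination of a statistical speech model with per-sequence confusability can be illustrated with a minimal sketch. It is not the paper's implementation: relative n-gram frequencies stand in for the Markov graph, the `confusion` table stands in for the output of the model trellis, and all function names, the toy corpus, and the numbers are hypothetical.

```python
from collections import Counter


def ngram_probabilities(utterances, n):
    """Relative frequency of each phoneme n-gram (n = 1, 2 or 3) in a corpus.

    Assumption: plain relative frequencies are used here as a simple
    stand-in for the Markov-graph speech model named in the abstract.
    """
    counts = Counter()
    for phones in utterances:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def expected_false_alarm_score(seq_probs, confusion):
    """Sum over sequences of P(sequence) * P(false alarm | sequence).

    Assumption: `confusion` maps a phoneme sequence to the probability
    that it triggers the keyword model; in the paper this role is played
    by the model trellis, which is not reproduced here.
    """
    return sum(p * confusion.get(seq, 0.0) for seq, p in seq_probs.items())


# Toy background corpus of phoneme strings (hypothetical data).
corpus = [["a", "b", "a", "b"], ["b", "a", "c"]]
bigrams = ngram_probabilities(corpus, 2)

# Hypothetical confusability of two bigrams with one candidate keyword.
confusion = {("a", "b"): 0.5, ("a", "c"): 0.1}
score = expected_false_alarm_score(bigrams, confusion)
print(f"suitability score (lower is better): {score:.2f}")
```

Under these assumptions, a candidate keyword with a lower score would be preferred, since the background sequences that resemble it are either rare or only weakly confusable.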
Similar resources
Discriminative keyword spotting
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve, as this quantity is the most common measure to evaluate keyword spotters. The keyword spotter we devise is based on nonlinearly mapping the input acoustic representat...
Speaker Dependent Bengali Keyword Spotting in Unconstrained English Speech Acknowledgement
A project report submitted during summer internship under the supervision of Prof. Abstract Multi‐lingual interfaces can be of great use in a number of applications. A very important issue for such systems is to first identify the segments of utterances corresponding to a specific language. Language boundary information is also very vital before any further processing can be done. Language spec...
Improving Task Independent Utterance Verification Based on On-line Garbage Phoneme Likelihood
Utterance verification based on on-line garbage (OLG) models is often adopted as the benchmark method. However, we find its performance can be remarkably improved by fine-tuning. In this study, OLG phoneme likelihood is proposed. It achieves much better performance and efficiency for task independent utterance verification to reject mis-recognition and OOV utterances than the OLG frame likeliho...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Voice activation using prosodic features
In this paper we propose a voice activation method based on prosodic keyword verification. In current voice activation systems features like the fundamental frequency contour have not been considered so far. Normally a continuous listening word spotter is used to detect a certain predefined keyword. We conducted an experiment which shows that people emphasize this keyword when they address a re...
Publication date: 1997